Very fast pipelined shifter element with parity prediction

ABSTRACT

A shifting structure and method which separates a shifting operation into partial shifts which may be executed in different pipeline stages is described herein. In a first pipe stage, an operand is read out and at least one partial shift is accomplished by placing the operand or parts thereof into registers coupled to a shift unit. The shift unit, in a second pipe stage, finalizes the shifting operation by executing the remaining partial shifts, thereby reducing the time required for the total shifting operation. A control string is derived in the shift unit based on the shift amount to correct the output of the shifted result as well as to provide for parity prediction therefor.

FIELD OF THE INVENTION

The invention relates to shifting of an operand in a shifting unit.

PRIOR ART

A shift register, or shift unit, has the ability to transfer information in a lateral direction. Shift registers normally represent n-stage devices whose output consists of an n-bit parallel data word. Application of a single clock cycle to the shift register device causes the output word to be shifted by one bit position from right to left (or from left to right). The leftmost (or rightmost) bit is lost from the "end" of the register while the rightmost (or leftmost) bit position is loaded from a serial input terminal.

Shift registers with parallel outputs, and with combinational logic fed from those outputs, are of great importance in digital signal processing, and in the encoding and decoding of error-correcting and error-detecting codes. Such registers may be implemented in hardware or in software, and may be binary or q-ary. (Hardware implementation is usually convenient only for binary and sometimes ternary logic.)

Shift units are commonly used together with arithmetical and logical units. FIG. 1 shows a block diagram of a prior art execution unit comprising a shift unit together with an arithmetical and a logical unit. All those units generally have to match in speed in order to avoid unbalanced net lengths and signal propagation times. The shift unit in particular tends to be the critical element in matching the speed of the arithmetical and logical units.

In certain applications, in order to provide a check on a group of binary values (e.g. a word, byte, or character), a parity function is commonly computed by forming the modulo-2 sum of the bits in the group. The generated sum, a redundant value, is called the parity bit. The parity bit is 0 if the number of 1s in the original group was even. The parity bit is 1 if the number of 1s in the original group was odd.

The parity computation just defined will cause the augmented group of binary values (the original group plus the parity bit) to have an even number of 1s; this is called even parity. In some cases, hardware considerations make it desirable to have an odd number of 1s in the augmented group, and the parity bit is selected to cause the total number of 1s to be odd; this is called odd parity. A parity check, or odd-even check, is the computation, or recomputation for verification, of a parity bit to determine if a prescribed parity condition is present.
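The two parity conventions just defined can be illustrated by a minimal sketch. The function name `parity_bit` and the `odd` flag are chosen here for illustration only and do not appear in the specification.

```python
def parity_bit(bits, odd=False):
    """Return the parity bit for a group of 0/1 values.

    Even parity: the bit is the modulo-2 sum of the group.
    Odd parity: the bit is the complement of that sum.
    """
    modulo2_sum = sum(bits) % 2
    return modulo2_sum ^ 1 if odd else modulo2_sum


group = [1, 0, 1, 1, 0, 0, 1, 0]           # a byte containing four 1s
even_p = parity_bit(group)                 # augmented group keeps an even number of 1s
odd_p = parity_bit(group, odd=True)        # augmented group gets an odd number of 1s
```

With either convention, a parity check recomputes the bit and compares it with the stored one.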

Shift units in applications comprising a parity function have to restore the parity after each shifting operation since the sum of bits in a respective group might have been changed. This is generally accomplished by a parity generation subsequent to the shifting operation. As an additional security feature some applications further comprise a parity prediction function predicting the new parity bits independently from the parity generation of the shift unit. The generated and the predicted parity bits are compared, and a mismatch between them reveals a defect in either the parity generation or the parity prediction unit. However, such parity analyses require a certain amount of processing time and are consequently especially critical with respect to the required matching in timing with other processing units.

As shown in FIG. 1, the operand to be shifted (e.g. 4 or 8 byte) is generally read from a Data Local Store DLS and put into operand registers A REG or B REG. In a next cycle the data are processed in either one of the processing units, e.g. shifted, and written back to the Data In Register DI of the DLS. In parity checked systems a byte parity, e.g. the byte parity P0-P7, of the shifted data is generated, adding additional delay to the shifter path.

Shift operations usually, e.g. in IBM S/390 based computers (IBM and S/390 are registered trademarks of International Business Machines Corporation), perform 4 or 8 byte shifts, right or left, as logical or arithmetical operations. In addition a whole variety of special micro instructions are usually executed by the shift unit. The shift amount is commonly split into 32-16-8-4-2-1 bit shifting elements (e.g. when multiplexers with up to 4 inputs are comprised) which are passed through in consecutive order. Shift amounts between 0 and 63 bits are thus possible. Levels may be bypassed by passing straight through. Shifting to the right or left is done by applying the appropriate multiplexer levels for shifting right/left.
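The consecutive 32-16-8-4-2-1 arrangement can be sketched in software as follows. This is only a behavioural model under the assumption that each level either applies its fixed amount or passes the data straight through; the name `staged_shift_left` is illustrative.

```python
MASK64 = (1 << 64) - 1


def staged_shift_left(value, amount):
    """Logical left shift of a 64-bit value through six consecutive levels."""
    assert 0 <= amount <= 63
    for step in (32, 16, 8, 4, 2, 1):      # largest shifting element first
        if amount & step:                  # level applied
            value = (value << step) & MASK64
        # otherwise the level is bypassed (data pass straight through)
    return value
```

Every result has to traverse all six levels, which is the delay the invention sets out to reduce.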

Parity checking for the data path is commonly achieved by generating the byte parity. For parity prediction, a parity bit of the complete double word (8 byte) is generated and compared with a predicted double word parity bit. The predicted parity bit is achieved by selecting the byte parity bits of the bytes which are not shifted out. Other bytes are completely shifted out and replaced by zeroes, and one byte may be partially affected by the shift (1 to 7 bits of a byte may be shifted out).

Applying this scheme for odd parity, the predicted parity is composed of the remaining byte parity bits, 1's for the byte parity bits which are completely shifted out and replaced by zeroes (odd parity assumed), and the parity bit of the byte which is partially shifted. The original parity bit of the partially shifted byte is flipped by each of its 1's which are shifted out.
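The prediction scheme can be checked with a small behavioural sketch for a logical left shift of a double word. It assumes odd byte parity, that P0 belongs to the leftmost byte, and that the generated double word parity is the XOR of the eight byte parity bits of the result; all function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def ones_parity(x):
    """Modulo-2 sum of the bits of x."""
    return bin(x).count("1") & 1


def byte_parity_bits(word):
    """Odd byte parity bits P0..P7, P0 for the leftmost byte."""
    return [1 ^ ones_parity((word >> (56 - 8 * i)) & 0xFF) for i in range(8)]


def generated_dw_parity(result):
    """Double word parity generated from the shifted result."""
    p = 0
    for bit in byte_parity_bits(result):
        p ^= bit
    return p


def predicted_dw_parity(word, sa):
    """Predict the double word parity of (word << sa) from the original operand."""
    parities = byte_parity_bits(word)
    full, part = divmod(sa, 8)             # bytes fully lost, bits lost from the next byte
    p = 0
    for i in range(8):
        if i < full:
            p ^= 1                         # byte replaced by zeroes: odd parity bit is 1
        elif i == full and part:
            byte = (word >> (56 - 8 * i)) & 0xFF
            lost = byte >> (8 - part)      # leading bits shifted out of this byte
            p ^= parities[i] ^ ones_parity(lost)   # flipped once per lost 1
        else:
            p ^= parities[i]               # byte fully retained
    return p
```

A mismatch between `predicted_dw_parity(word, sa)` and `generated_dw_parity((word << sa) & MASK64)` would indicate a fault in the shifter or in one of the parity paths.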

Shift units for shifting up to 64 bits are generally composed of 6 multiplexer levels and 2 XOR levels for parity generation. Shift units for shifting up to 32 bits are composed of 5 multiplexer levels and 2 XOR levels for parity generation. For parity prediction additional logic levels are necessary.

The shift amount in byte (8 bit) oriented systems is commonly split into 1-2-4-8-16-32-. . . -k/4-k/2 bit shifting elements which are passed through in consecutive order, usually starting with the largest shift amount, so that shift amounts between 0 and k-1 bits in total are possible. Shift amounts >= k for k bit words are usually meaningless since the word is then shifted out completely and the result consists of 0's regardless of the actual shift amount. However, the signals have to pass through several levels of logic until the wanted result can be received. This is quite time consuming and a great disadvantage.

SUMMARY OF THE INVENTION

It is an object of the invention to provide an improved shift unit.

The object of the invention is solved by the independent claims.

A new functional unit is introduced which is able to shift data in various ways, and also to perform a truncation of data, either on the left or on the right side. A speed advantage is achieved by exploiting pipelining, i.e. the execution is split into multiple cycles. All operations can be parity checked by predicting the correct parity.

In very general terms, the invention introduces a shifting structure which separates a shifting operation into partial shifts which can be executed in different pipeline stages. In a first pipe stage, an operand is read out and at least one partial shift is accomplished by placing the operand or parts of the operand into registers coupled to a shift unit. The shift unit, in a second pipe stage, finalises the shifting operation by executing the remaining partial shifts. This reduces the time required for the shifting operation in total and also allows the partial shifts to be distributed over the different pipeline stages in order to use any remaining period of time in a cycle.

Further reductions of the time required for the shifting operation in the second cycle can be accomplished by a shift unit according to the invention.

In the shift unit according to the invention, the operand to be shifted is read, in a first pipe (pipeline) stage, from a data store and put into one of two operand registers, each k/2 bits long, whereby k represents an integer value. In a next cycle the data is processed, in a second pipe stage, in the k bits long shift unit, and written back to a data register. An optional parity generation unit and an optional parity prediction unit can also be applied to the shift unit of the invention.

The shift structure according to the invention splits up the shifting across both pipe stages. The data store contains data, each with a maximum length of k/2 bits, which can be read individually or as pairs with k bit length. The data store further comprises a multiplexing unit which allows each one of the data to be placed in either one of the operand registers.

The functioning of the multiplexing unit is part of the shifting function and controlled by an instruction control unit as known in the art. The multiplexing unit can provide a k/2 bit shifting by placing each one of the k/2 bit data to be shifted in either one of the operand registers and therefore on either the right or left side of the k bit shift unit, thus representing a k/2 bit shifting element. Consequently, the shift unit of the invention only requires 1-2-4-8-16-32-. . . -k/4 bit shifting elements, thus saving one shifting element, and ergo one shift level, with respect to shift units as known in the art.

When a k bit word, comprised of two k/2 data words from the data store, is to be shifted with a shift amount >= k/2 to the left (right), only the rightmost (leftmost) k/2 data word needs to be read into the leftmost (rightmost) one of the operand registers, already representing a k/2 shift, during the cycle in the first pipe stage. Consecutive shifting is then applied by the shift unit during the next cycle in the second pipe stage. Since the shift amount k/2 is already implemented in the first pipe stage, only shift amounts of up to k/2-1 in total are required in the consecutive shift unit of the second pipe stage.

Shifting with a shift amount < k/2 to the left or right is applied by only the shift unit during the cycle in the second pipe stage.
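The division of work between the two pipe stages can be sketched for a logical left shift with k=64. The sketch assumes that the register not loaded in the first pipe stage holds zeroes (the specification treats its content as don't care and corrects the result later); the function names are illustrative.

```python
K = 64
HALF = K // 2
MASK_HALF = (1 << HALF) - 1
MASK_K = (1 << K) - 1


def stage1_left(hi, lo, sa):
    """Pipe stage 1: place the half words into (A REG, B REG).

    Returns the register contents and the residual shift amount.
    """
    if sa >= HALF:
        return lo, 0, sa - HALF            # loading lo into A REG is the k/2 shift
    return hi, lo, sa


def stage2_left(a_reg, b_reg, sa):
    """Pipe stage 2: the shift unit only handles amounts 0..k/2-1."""
    assert 0 <= sa < HALF
    return (((a_reg << HALF) | b_reg) << sa) & MASK_K


def shift_left(word, sa):
    hi, lo = word >> HALF, word & MASK_HALF
    return stage2_left(*stage1_left(hi, lo, sa))
```

For example, `shift_left(word, 40)` loads the low half word into A REG in stage 1 and leaves only a shift by 8 for stage 2.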

It is to be understood that the shift structure of the present invention is not limited to only k/2 shifts in the first pipe stage. The data can also be read out individually or in a combination into a plurality of operand registers. The multiplexing unit then places the read out data to be shifted in the plurality of operand registers. Dependent on the number of operand registers and their respective bit length, various possibilities of shifts can be accomplished. If, for example, four operand registers are provided, k/4 and/or k/2 shifts can be effected. It is to be understood that the number of shift levels and the shift amounts in each of the pipe stages depends on the time period provided in each cycle.

In order to further reduce the number of shift levels and thus the time required for a shifting operation, the shift unit of the present invention preferably executes shifting operations in only one direction, e.g. only to the left. The operand bits are shifted in a circular manner. Shift operations in the opposite direction are done by shifting left with a complement shift amount.

The shifting in a shift unit with a circular shift manner can be divided into consecutive shift levels. Each shift level generally allows a certain maximum number of shift gates n, p, q, etc., each with different shift amounts, whereby n, p, q, etc. represent integer values. For example, when CMOS technology is applied the maximum number of shift gates usually is limited to four.

A first shift level allows a maximum of n shift gates with the following shift amounts: 0, k/(2n), 2 * k/(2n), 3 * k/(2n), 4 * k/(2n), . . . , (n-2) * k/(2n), (n-1) * k/(2n), each with a distance of k/(2n) between two shifting amounts next to each other. A second shift level allows a maximum of p shift gates with the shift amounts: 0, k/(2np), 2 * k/(2np), 3 * k/(2np), 4 * k/(2np), . . . , (p-2) * k/(2np), (p-1) * k/(2np), each with a distance of k/(2np) between two shifting amounts next to each other. A third shift level allows a maximum of q shift gates with the shift amounts: 0, k/(2npq), 2 * k/(2npq), 3 * k/(2npq), 4 * k/(2npq), . . . , (q-2) * k/(2npq), (q-1) * k/(2npq), each with a distance of k/(2npq) between two shifting amounts next to each other. Each consecutive shift level reduces the distance between two shifting amounts next to each other to k/(2π), whereby π represents the product of the maximum numbers of shift gates of all shift levels up to and including the present one. It is clear that each shifting amount can only be an integer value and that the last shift level ends with a shifting amount of 1.
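The rule for the shift amounts of each level can be written down as a short sketch. The helper `level_amounts` is hypothetical; it takes k and the number of gates per level (n, p, q, ...) and returns the amounts selectable at each level, assuming the divisions come out as integers.

```python
def level_amounts(k, gates):
    """Shift amounts per level for a k bit circular shift unit.

    gates: number of shift gates in each consecutive level, e.g. [n, p, q].
    The distance at a level is k / (2 * product of gates up to that level).
    """
    levels = []
    product = 1
    for g in gates:
        product *= g
        distance = k // (2 * product)
        levels.append([i * distance for i in range(g)])
    return levels


amounts_64 = level_amounts(64, [4, 4, 2])     # [[0, 8, 16, 24], [0, 2, 4, 6], [0, 1]]
amounts_128 = level_amounts(128, [4, 4, 4])   # [[0, 16, 32, 48], [0, 4, 8, 12], [0, 1, 2, 3]]
```

Taking the largest amount of every level, the sum is k/2-1 in both cases, so all residual shift amounts left for the second pipe stage are covered.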

As apparent from the above, the shift unit according to the invention reduces the number of shifting levels at least by one. However, shift units with a circular shift manner allow a further remarkable reduction of the shift levels. An additional feature required when k/2 bit data have to be shifted is a duplication function within the multiplexing unit in the first pipe stage, allowing the contents of the operand registers to be duplicated.

The resulting data of the circular shift operation often need further manipulation in order to receive the same result as from a linear shift operation, e.g. leading/trailing zeroes or sign extension. Linear shifting of data ABCD EFGH (with A to H each representing an individual byte) results in the remaining bits of the original data filled up with additional 0's for the shifted out bits. As an example, a shift by 24 (24 bits or 3 bytes) to the right for the data ABCD EFGH would result in the new data 000A BCDE. However, shifting in a circular manner in that example would result in FGHA BCDE and consequently requires a certain treatment in order to receive the same result as from the linear shifting. Such a treatment is preferably done by a control string, i.e. an individual string of bit or byte values which corrects the circular shift result into a linear shift result. The shift amount is decoded to the string which defines the valid bits of the shift result.
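The ABCD EFGH example can be reproduced with a short sketch. It assumes k=64, models the bytes A to H by their ASCII codes, and takes the complement shift amount for a right shift by SA to be k-SA; the names `rotl` and `shift_right_logical` are illustrative.

```python
K = 64
MASK = (1 << K) - 1


def rotl(value, amount):
    """Circular left shift of a K bit value."""
    amount %= K
    return ((value << amount) | (value >> (K - amount))) & MASK


def shift_right_logical(value, sa):
    """Linear right shift built from a circular left shift and a control string."""
    rotated = rotl(value, K - sa)          # circular result, e.g. FGHA BCDE
    valid = MASK >> sa                     # control string: sa zeroes at the left
    return rotated & valid                 # invalid bits forced to zero


word = int.from_bytes(b"ABCDEFGH", "big")
circular = rotl(word, K - 24).to_bytes(8, "big")              # b"FGHABCDE"
linear = shift_right_logical(word, 24).to_bytes(8, "big")     # b"\x00\x00\x00ABCDE"
```

The AND with `valid` plays the role of the control string: it keeps the bits that a linear shift would retain and clears the bits that merely wrapped around.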

The string also allows the optional parity prediction to be controlled. It defines the validity of the data and optionally selects the parity bits for the parity prediction. The application of the string significantly reduces the amount of control logic required.

For the optional parity prediction, it is to be understood that the counting of the 1's which are shifted out of the partially shifted byte can preferably be accomplished on one side only, preferably the side to which the bits are shifted in a circular manner. This requires a certain treatment of the data in the first pipeline stage, e.g. duplication of the data. This reduces the amount of logic circuits to be passed through in the parity prediction logic.

The shift structure of the invention can be used e.g. in an execution unit for executing data manipulations in a processor unit which can be part of a processor chip.

DESCRIPTION OF THE DRAWINGS

The invention will now be described by way of example and with reference to the accompanying drawings in which:

FIG. 1 shows a block diagram of a prior art execution unit comprising a shift unit together with an arithmetical and a logical unit,

FIG. 2 shows a shifting unit according to the invention,

FIG. 3 shows the structure of one embodiment of the shift unit 10 with a circular shift manner,

FIG. 4a shows the structure of one embodiment of the shift unit 10 with a circular shift manner and parts 30a of the optional parity prediction unit 30,

FIG. 4b shows the structure of an embodiment of parts 30b of the optional parity prediction unit 30 and of the optional parity generation unit 20.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 2 shows a shifting unit according to the invention. It is to be understood that the shift unit of FIG. 2 can be part of the execution unit of FIG. 1. As already shown in FIG. 1, the operand to be shifted is read out, in pipe stage 1, from the Data Local Store DLS and put into the operand registers A REG or B REG, each k/2 bits long. In a next cycle the data are processed, in pipe stage 2, in a shift unit 10, and written back to the Data In Register DI of the DLS. FIG. 2 further shows an optional parity generation unit 20 and an optional parity prediction unit 30. Those functions will be explained later.

The shift structure according to the invention splits up the shifting across the two pipe stages 1 and 2. The data local store DLS contains data R0, R1, R2, R3, etc., each with a maximum length of k/2 bits, which can be read individually or as pairs with k bit length. The data local store DLS further comprises a multiplexing unit 40 which allows each one of the data R0, R1, R2, R3, etc. to be placed in either register A REG or B REG.

The functioning of the multiplexing unit 40 is already part of the shifting function and controlled by an instruction control unit, not shown herein, as known in the art. The multiplexing unit 40 provides a k/2 bit shifting and therefore represents the k/2 bit shifting element. Consequently, the shift unit 10 of the invention only requires 1-2-4-8-16-32-. . . -k/4 bit shifting elements, thus saving one shifting element, and ergo one shift level, with respect to shift units as known in the art.

When a k bit word, comprised of two k/2 data words from the data local store DLS, is to be shifted with a shift amount >= k/2 to the left (right), only the rightmost (leftmost) k/2 data word needs to be read into register A REG (B REG), already representing a k/2 shift, during the cycle in pipe stage 1. Consecutive shifting is then applied by the shift unit 10 during the next cycle in pipe stage 2.

The shift unit 10 of an embodiment of the invention only executes shifting operations in one direction, i.e. only to the left. The operand bits are shifted in a circular manner. Any further manipulation which the resulting data need, e.g. leading/trailing zeroes or sign extension, is preferably done by a control string, which is explained later. Shift operations in the opposite direction are done by shifting left with a complement shift amount CSA.

FIG. 3 shows the structure of one embodiment of the shift unit 10 with a circular shift manner, wherein k=64 and the maximum number of shifting gates is n=4 in each shift level. Since the shift amount k/2=32 is already implemented in pipe stage 1, only shift amounts of up to 31 in total are required. The first shift level 100 allows the shifting amounts 0, 8, 16, 24, each with a distance of k/(2n)=8 between two shifting amounts next to each other. The second shift level 110 allows the amounts 0, 2, 4, 6, each with a distance of k/(2nn)=2 between two shifting amounts next to each other. In the third and last shift level 120, only shifting amounts of 0 and 1 are necessary.

Accordingly, another shift unit 10 with a circular shift manner, wherein k=128 and the maximum number of shifting gates is n=4 in each shift level, would require the following shift levels. The first shift level 100 allows the shifting amounts 0, 16, 32, 48, each with a distance of k/(2n)=16 between two shifting amounts next to each other. The second shift level 110 allows the amounts 0, 4, 8, 12, each with a distance of k/(2nn)=4 between two shifting amounts next to each other. In the third and last shift level 120, shifting amounts of 0, 1, 2 and 3 are necessary.
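For the k=64 embodiment of FIG. 3, the selection of one gate per level follows directly from the binary representation of the residual shift amount. The following sketch illustrates this; the function name `select_gates` and the bit-field decoding are assumptions for illustration, not taken from the figures.

```python
def select_gates(sa):
    """Split a residual shift amount 0..31 into the amounts chosen at levels 100, 110, 120."""
    assert 0 <= sa <= 31
    level_100 = sa & 0b11000               # 0, 8, 16 or 24
    level_110 = sa & 0b00110               # 0, 2, 4 or 6
    level_120 = sa & 0b00001               # 0 or 1
    return level_100, level_110, level_120


gates_for_29 = select_gates(29)            # (24, 4, 1)
```

Since the three amounts always add up to the requested shift amount, each level needs at most four gates and the data pass through only three multiplexer levels.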

When k/2 bit data have to be shifted, the shifting structure requires as an additional feature a duplication function within the multiplexing unit 40, allowing the content of register A REG to be duplicated into register B REG, and vice versa. Duplication might be necessary, e.g. for IBM S/390 instructions such as insert character under mask (ICM), compare logical characters under mask (CLM), store characters under mask (STCM), or truncation (TRUNC) functions, and for all shifts applying parity prediction.

Shifting in a circular manner requires a certain treatment in order to receive the same result as from a linear shifting. In a preferred embodiment of the invention, such a treatment is done by an individual string of bit values. The shift amount is decoded to the bit string which defines the valid bits of the shift result and also allows the optional parity prediction to be controlled. It defines the validity of the data and optionally selects the parity bits for the parity prediction.

Example: A 64 Bit Shift Unit

An example of an embodiment will now be given in order to explain the invention in greater detail. The embodiment comprises a structure according to FIG. 2 with a 64 bit shift unit. The shift unit of the example is able to perform:

1) Shifts of 4 byte or 8 byte operands to the left or right. The shifts can be arithmetical shifts (sign extension required) or logical shifts. The shift amount varies between 0-63 bits.

2) Byte operations like the IBM S/390 instruction Insert Character under Mask (ICM). Bytes from a contiguous field in storage, whose length equals the number of 1's in the mask, are rearranged according to the position of the 1's in the mask. The operand length is up to 4 bytes. See also Table 4.

    ______________________________________
    Example    Storage operand    ABC
               Mask               1101
               Result             AB0C
    ______________________________________

3) The IBM S/390 instruction Store Characters under Mask (STCM): Bytes from a register are selected according to the mask and stored at contiguous byte locations in the storage. See also Table 5.

    ______________________________________
    Example    Register    ABCD
               Mask        0110
               Result      00BC
    ______________________________________

4) Truncation: 4 byte operands are truncated on the left or right side. The truncation amount can be 0-31.
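The two mask examples given under 2) and 3) can be mirrored by a small sketch operating on strings of byte names. The function names are hypothetical, and the handling of unselected positions (filled with "0", with the STCM result right justified) merely follows the examples as printed; it is not a statement about the architected behaviour of the instructions.

```python
def insert_under_mask(storage, mask, fill="0"):
    """Distribute contiguous storage bytes to the positions of the 1's in the mask."""
    source = iter(storage)
    return "".join(next(source) if m == "1" else fill for m in mask)


def store_under_mask(register, mask, fill="0"):
    """Collect the register bytes selected by the mask into a contiguous field."""
    selected = "".join(b for b, m in zip(register, mask) if m == "1")
    return selected.rjust(len(register), fill)


icm_result = insert_under_mask("ABC", "1101")     # "AB0C"
stcm_result = store_under_mask("ABCD", "0110")    # "00BC"
```

In both cases each byte only moves by a multiple of 8 bits, which is why such operations can be handled by byte shifts alone.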

Shifting Structure of the Example

The above functions can be done through a shifting element applying 6 levels of multiplexers, whereby a level 1 is able to perform a shift amount of 32 to the left or right or to pass the data straight through. Additional levels perform shift amounts of 16, 8, 4, 2, 1. However, the signals then have to pass through 6 levels of logic to get the result. This is quite time consuming and a great disadvantage.

FIG. 4a shows the structure of one embodiment of the shift unit 10 with a circular shift manner and parts 30a of the optional parity prediction unit 30. FIG. 4b shows the structure of an embodiment of parts 30b of the optional parity prediction unit 30 and of the optional parity generation unit 20. Only 3 levels of multiplexers perform the functions of Table 1. The data are provided in the A and B registers according to Table 1. Pipe stage 1 in FIG. 2 is able to provide the data as expected in Table 1. This is necessary for the shifting unit of pipe stage 2.

Shift Unit 10 of the Example

The shift unit 10 can perform all required functions with 3 multiplexer levels, see FIG. 4a. Shifts with a shift amount SA >= 32 are already reduced in the first pipe stage to shifts with an amount of 0-31. Three levels (SU level 1-3) of multiplexers are necessary to perform all shifts with SA 0-31 and in addition some more functions like the IBM S/390 instructions ICM, CLM, STCM and a variety of micro instructions. SU level 1 does the byte shifts (8, 16, 24 to the left). These shifting levels are necessary for all byte shifts and especially for operations controlled by a mask, like ICM or STCM. SU level 2 does left shifts of 2, 4, or 6 bits, and level 3 does only a shift of 1.

All 3 levels can be bypassed by activating the input STRAIGHT. Bits of the shift result are forced to zero by deactivating all gate signals at SU level 3. An AUXILIARY input is used to do the sign extension in case of arithmetical right shifts or to insert the sign in case of arithmetical left shifts. All byte shifts are done at shift level 1, whereas all shifts with an amount <= 7 are done in levels 2 and 3. For example, ICM/STCM needs only shift level 1, see Tables 4 and 5.

Embodiment of a String in the Example

The shift amount is decoded to a bit string which defines the valid bits of the shift result and controls parity prediction. The string defines the validity of the data and selects the parity bits for the parity prediction, see Table 3.

Table 1 shows the function of the control string S. The string is 32 bits wide, but controls the shift unit result of up to 64 bits. This is due to the fact that the second half of the string is either completely `0` or completely `1`. Logically the string can be considered as a string of 64 bits. For logical shifts, where vacated bit positions are replaced by zeroes, or arithmetical shifts, where the sign bit is extended (shift right arithmetical, SRA), the control string S(i), i=0-63, is applied to define the leading zeroes or to define the bit positions carrying the sign. The control string S is generated from the shift amount SA, e.g. if SA=16 then S(i)=1 for i=0-47 and S(j)=0 for j=48-63. The string S then contains 48 1's at the left and 16 zeroes at the right side. Each bit position i of S controls the bit position i of level 3 of the shift unit, as apparent from FIG. 4. If none of the inputs (STRAIGHT, LEFT 1, AUXILIARY) is activated, the output of the MUXi is zero.

For shift left logical, the string can be directly applied. For shift right logical, the string is swapped, so that 16 zeroes are at the left and 48 1's at the right side, considering the above example.
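The decoding of the shift amount into the control string, and its swapped form for right shifts, can be sketched as follows. The string is modelled as a list of bits with index 0 at the left; the function names are illustrative, and the logical 64 bit view of the string is used unless a different width is passed.

```python
def control_string(sa, width=64):
    """Decode a shift amount into S: (width - sa) ones followed by sa zeroes."""
    assert 0 <= sa <= width
    return [1] * (width - sa) + [0] * sa


def swapped_control_string(s):
    """Swapped string as applied for right shifts: zeroes at the left."""
    return s[::-1]


s = control_string(16)                     # 48 ones at the left, 16 zeroes at the right
c = swapped_control_string(s)              # 16 zeroes at the left, 48 ones at the right
```

Called with `width=32`, the same decoding yields a 32 bit string of the form listed in Table 3.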

For arithmetical shifts where the sign is extended (shift right arithmetical, SRA) or the original sign keeps its position (shift left arithmetical, SLA), the AUXILIARY input of MUX level 3 is used to force the sign bit. In case of SRA the sign is forced to all bit positions where S(i), i=0-63, carries a zero.
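The sign forcing for SRA can be modelled in a few lines. The sketch assumes a 64 bit operand, a circular left shift by the complement amount 64-SA, and the swapped control string represented as an integer mask; the names are illustrative and the AUXILIARY input is modelled simply as an OR of the sign into the invalid positions.

```python
K = 64
MASK = (1 << K) - 1


def rotl(value, amount):
    """Circular left shift of a K bit value."""
    amount %= K
    return ((value << amount) | (value >> (K - amount))) & MASK


def shift_right_arithmetical(value, sa):
    """SRA built from a circular left shift, a control string and sign forcing."""
    sign = value >> (K - 1)
    rotated = rotl(value, K - sa)
    valid = MASK >> sa                     # swapped control string: sa zeroes at the left
    forced = (MASK ^ valid) if sign else 0 # sign forced where the string carries a zero
    return (rotated & valid) | forced


sra_example = shift_right_arithmetical(0x8000000000000000, 4)   # 0xF800000000000000
```

Positions where the string carries a one take the rotated data, and all other positions take the sign bit.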

Table 1 shows IBM S/390 shift instructions, the applied control string S, and the appropriate result in A REG and B REG of the operation in pipe stage 1. It is assumed that double shifts access the operand ABCD EFGH and 4 byte shifts access only ABCD, whereby A-H represent a byte each.

String support for parity prediction in the example

In FIG. 4b the parity prediction logic greatly reduces the amount of control logic which is generally necessary for parity prediction. The generated double word parity (signal +GENERATED DOUBLE WORD PARITY) is compared with a predicted double word parity. For that, the parity bits of the original shift operand are selected according to the control string S, see Table 2. The selected parity bits generate a predicted double word parity which is again manipulated by sign insertion (in case of arithmetic shift left) or sign extension (in case of arithmetic shift right) and by the partially shifted out bits (signal PARITY OF PART. SHIFTED BYTE).

For right shifts, the bits 7, 15, 23, and 31 of the swapped control string define the parity bits of the original operand to be taken for parity prediction. For left shifts, the control string bits 0, 8, 16 and 24 define the parity bits for the prediction.

Tables

TABLE 1
______________________________________
Overview of IBM S/390 shift instructions, applied string and expected operand in second pipe stage.

Instruction    Shift amount   String       A REG    B REG
______________________________________
SLL            >= 32          0000 0000    xxxx     xxxx
SLL            <32            ssss 0000    ABCD     xxxx
SLDL           >= 32          ssss 0000    EFGH     xxxx
SLDL           <32            1111 ssss    ABCD     EFGH
SLA            >= 32          0000 0000    xxxxx    xxxx
                              v
SLA            <32            ssss 0000    ABCD     xxxx
                              v
SLDA           >= 32          ssss 0000    EFGH     xxxx
                              v
SLDA           <32            1111 ssss    ABCD     EFGH
SRL            >= 32          0000 0000    xxxx     xxxx
SRL            <32            cccc 0000    ABCD     ABCD
SRDL           >= 32          0000 cccc    ABCD     EFGH
SRDL           <32            cccc 1111    EFGH     ABCD
SRA            >= 32          0000 0000    xxxx     xxxx
                              vvvv
SRA            <32            cccc 0000    ABCD     ABCD
                              v..
SRDA           >= 32          0000 cccc    ABCD     EFGH
                              vvvv v...
SRDA           <32            cccc 1111    EFGH     ABCD
                              v..
TRUNC Right    <32            0000 ssss    ABCD     ABCD
TRUNC left     <32            0000 cccc    ABCD     ABCD
ICM/STCM                      1111 0000    ABCD     ABCD
______________________________________
Legend:
s: valid bits of the control string
c: valid bits of swapped control string
v: sign bit in case of arithmetical shifts
x: Don't care
original data ABCD if 4 byte operand
original data ABCD EFGH if 8 byte operand

TABLE 2
______________________________________
Overview of selected operand parity bits for parity prediction

                   Shift     A REG parity       B REG parity
Instruction        amount    P0   P1   P2   P3   P4   P5   P6   P7
______________________________________
control string applied
SLL,SRL,SLA,SRA    >=32       1    1    1    1    1    1    1    1
SLL                <32       24   16    8    0    1    1    1    1
SLDL               >=32      24   16    8    0    1    1    1    1
SLA                <32       24   16    8    0    1    1    1    1
SLDA               >=32      24   16    8    0    1    1    1    1
SLDL               <32       24   16    8    0    t    t    t    t
SLDA               <32       24   16    8    0    t    t    t    t
swapped control string applied
SRL                <32       31   23   15    7    1    1    1    1
SRDL               >=32      31   23   15    7    1    1    1    1
SRA                <32       31   23   15    7    1    1    1    1
SRDA               >=32      31   23   15    7    1    1    1    4
SRDL               <32       31   23   15    7    t    t    t    t
SRDA               <32       31   23   15    7    t    t    t    t
______________________________________
Legend:
Pi, i = 0-7: parity of original operand
1: appropriate parity bit forced to one
t: parity bit taken

                  TABLE 3
_____________________________________________________
Decoding of the shift amount into a string of 32 bits
Shift amount   String 0-31
_____________________________________________________
0              11111111111111111111111111111111
1              11111111111111111111111111111110
2              11111111111111111111111111111100
3              11111111111111111111111111111000
4              11111111111111111111111111110000
5              11111111111111111111111111100000
6              11111111111111111111111111000000
7              11111111111111111111111110000000
.              .
.              .
.              .
30             11000000000000000000000000000000
31             10000000000000000000000000000000
_____________________________________________________
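The decoding in Table 3 can be sketched in a few lines. This is only an illustration of the table, not the hardware decoder; the function name `decode_shift_amount` is hypothetical.

```python
def decode_shift_amount(sa: int) -> str:
    """Return the 32-bit string of Table 3 for a shift amount in 0..31."""
    if not 0 <= sa <= 31:
        raise ValueError("shift amount must be in 0..31")
    # (32 - sa) ones mark the valid bit positions of the shift result,
    # followed by sa zeros for the vacated positions.
    return "1" * (32 - sa) + "0" * sa


for sa in (0, 1, 7, 30, 31):
    print(f"{sa:2d}  {decode_shift_amount(sa)}")
```

Each printed row corresponds to the Table 3 row with the same shift amount.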

                  TABLE 4
____________________________________________________________
Shift operations for Insert Character under Mask, Pipe stage 2.
      A Reg                    B Reg
ICM   By0   By1   By2   By3    By4   By5   By6   By7    Mask
____________________________________________________________
                                                        0000
                        ST                              0001
                        L8                              0010
                  ST    ST                              0011
                        L16                             0100
                  L8    ST                              0101
                  L8    L8                              0110
            ST    ST    ST                              0111
                        L24                             1000
                  L16   ST                              1001
                  L16   L8                              1010
            L8    ST    ST                              1011
                  L16   L16                             1100
            L8    L8    ST                              1101
            L8    L8    L8                              1110
      ST    ST    ST    ST                              1111
____________________________________________________________
Legend:
ST: appropriate Byte moved straight
L8, L16, L24: appropriate Byte shifted left 8, 16, 24
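The entries of Table 4 follow a simple pattern, sketched below under an assumption that is a reading of the table rather than a statement from the description: the n bytes to be inserted arrive right-aligned in the A Reg (By(4-n) to By3), and each byte is shifted left to the register byte position selected by the next 1 bit of the mask. The function name `icm_shifts` is hypothetical.

```python
def icm_shifts(mask: int) -> dict:
    """Map each source byte index (By0..By3) to its Table 4 entry."""
    # Register byte positions selected by the mask, leftmost mask bit = byte 0.
    targets = [i for i in range(4) if mask & (8 >> i)]
    first_source = 4 - len(targets)  # assumption: operand is right-aligned
    shifts = {}
    for j, target in enumerate(targets):
        source = first_source + j
        distance = 8 * (source - target)  # left shift in bit positions
        shifts[source] = "ST" if distance == 0 else f"L{distance}"
    return shifts


for mask in range(16):
    row = icm_shifts(mask)
    cells = " ".join(f"By{k}:{v}" for k, v in sorted(row.items()))
    print(f"{mask:04b}  {cells}")
```

Under this assumption the sketch yields the same sequence of ST, L8, L16 and L24 entries as every row of the table.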

                  TABLE 5
____________________________________________________________
Shift operations for Store Character under Mask, Pipe stage 2.
      A Reg                    B Reg
STCM  By0   By1   By2   By3    By4   By5   By6   By7    Mask
____________________________________________________________
                                                        0000
                        ST                              0001
                        X                  L24          0010
                  ST    ST                              0011
                        X            L16                0100
                  X     ST           L24                0101
                  X     X            L24   L24          0110
            ST    ST    ST                              0111
                        X      L8                       1000
                  X     ST     L16                      1001
                  X     X      L16         L24          1010
            X     ST    ST     L24                      1011
                  X     X      L16   L16                1100
            X     X     ST     L24   L24                1101
            X     X     X      L24   L24   L24          1110
      ST    ST    ST    ST                              1111
____________________________________________________________
Legend:
ST: appropriate Byte moved straight
L8, L16, L24: appropriate Byte shifted left 8, 16, 24 from B Reg
X: Position, where the appropriate Byte is shifted to.
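The pattern behind Table 5 can be sketched in the same way. The assumptions here are a reading of the table, not taken from the description: the 4-byte register is duplicated into A Reg and B Reg, the selected bytes are collected right-aligned in By0 to By3, and a byte that is not already in place is taken from its copy in the B Reg and shifted left. The function name `stcm_moves` is hypothetical.

```python
def stcm_moves(mask: int) -> list:
    """Return (source, operation, destination) triples for a 4-bit mask."""
    # Register bytes selected by the mask, leftmost mask bit = byte 0.
    selected = [i for i in range(4) if mask & (8 >> i)]
    first_dest = 4 - len(selected)  # assumption: result is right-aligned
    moves = []
    for j, reg_byte in enumerate(selected):
        dest = first_dest + j
        if reg_byte == dest:
            # Byte is already in place in the A Reg: moved straight.
            moves.append((f"By{dest}", "ST", f"By{dest}"))
        else:
            # Take the duplicate from the B Reg (By4..By7) and shift it left.
            source = 4 + reg_byte
            moves.append((f"By{source}", f"L{8 * (source - dest)}", f"By{dest}"))
    return moves


for mask in (0b0010, 0b1010, 0b1101):
    print(f"{mask:04b}", stcm_moves(mask))
```

The destination of each shifted byte corresponds to an X entry in the table, and the source column carries the L8, L16 or L24 entry.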

Further Examples

EXAMPLE 1)

Shift left double SLDL (8 byte operand from register pair Ri, Ri+1, i = even) with shift amount SA<32

RA addresses register Ri and RB addresses register Ri+1. Thus the original operand ABCD EFGH is put to AREG as ABCD and BREG as EFGH.

    ____________________________________________________________
    SA = 8:   original Opnd.   X'01 23 45 67  89 AB CD EF'
              result pipe 1    X'01 23 45 67  89 AB CD EF'
              shifted Opnd.    X'23 45 67 89  AB CD EF 01'
              String S         X'FF FF FF FF  FF FF FF 00'
              Shifter output   X'23 45 67 89  AB CD EF 00'
    ____________________________________________________________
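Example 1 can be reproduced with a short sketch. The 64-bit string S is built here directly as a mask, which is an assumption chosen to match the String S row; the helper names `rotl64` and `sldl_below_32` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Circular left shift of a 64-bit value by n bit positions."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def sldl_below_32(operand: int, sa: int) -> int:
    """SLDL with SA < 32: circular shift, then correction by string S."""
    if not 0 <= sa < 32:
        raise ValueError("this sketch covers SA < 32 only")
    shifted = rotl64(operand, sa)          # "shifted Opnd." row
    string_s = (MASK64 << sa) & MASK64     # ones mark the valid result bits
    return shifted & string_s              # "Shifter output" row


print(f"{sldl_below_32(0x0123456789ABCDEF, 8):016X}")
```

The AND with string S is what turns the circular shift result into a linear one: the byte X'01' that wrapped around is cleared.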

EXAMPLE 2)

SLDL with SA>=32

RA addresses Ri+1 and RB addresses Ri. Thus the AREG contains EFGH and BREG contains ABCD. A circular shift of 32 bit positions has taken place, without adding delay. The original operand ABCD EFGH is read to A/BREG as EFGH ABCD.

    ____________________________________________________________
    SA = 48:  orig. Opnd.      X'01 23 45 67  89 AB CD EF'
              result pipe 1    X'89 AB CD EF  01 23 45 67'
              shifted Opnd.    X'CD EF 01 23  45 67 89 AB'
              String S         X'FF FF 00 00  00 00 00 00'
              Shifter output   X'CD EF 00 00  00 00 00 00'
    ____________________________________________________________
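A minimal sketch of Example 2 follows. It models the register swap of pipe stage 1 as a 32-bit circular shift and leaves only SA-32 bit positions for the shift unit; the helper names and the direct construction of string S as a 64-bit mask are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Circular left shift of a 64-bit value by n bit positions."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def pipe1_swap(operand: int) -> int:
    """Read Ri+1 into A REG and Ri into B REG (a circular shift by 32)."""
    return ((operand << 32) | (operand >> 32)) & MASK64


def sldl_from_32(operand: int, sa: int) -> int:
    """SLDL with 32 <= SA <= 63."""
    if not 32 <= sa <= 63:
        raise ValueError("this sketch covers SA >= 32 only")
    after_pipe1 = pipe1_swap(operand)         # "result pipe 1" row
    shifted = rotl64(after_pipe1, sa - 32)    # remaining shift in pipe stage 2
    string_s = (MASK64 << sa) & MASK64        # "String S" row
    return shifted & string_s                 # "Shifter output" row


print(f"{sldl_from_32(0x0123456789ABCDEF, 48):016X}")
```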

EXAMPLE 3)

Shift right double SRDL (8 byte) with SA<32

RA addresses Ri+1 and RB addresses Ri. Thus the original data ABCD EFGH appear in the AREG as EFGH and BREG as ABCD. This swap is necessary as shifting right is done by a circular left shift with the complement of the shift amount. As shift 32 is done in pipe 1, CSA=32-SA.

    ____________________________________________________________
    SA = 20:   orig. Opnd.        X'01 23 45 67  89 AB CD EF'
    CSA = 12:  result pipe 1      X'89 AB CD EF  01 23 45 67'
               shifted Opnd.      X'BC DE F0 12  34 56 78 9A'
               String S           X'FF F0 00 00  FF FF FF FF'
               String S swapped   X'00 00 0F FF  FF FF FF FF'
               Shifter output     X'00 00 00 12  34 56 78 9A'
    ____________________________________________________________
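The right shift of Example 3 can be sketched as follows. Two points are assumptions made to reproduce the rows of the example: string S is taken as the Table 3 decoding of SA in the upper 32 bits with the lower 32 bits all ones, and "swapped" is modelled as reversing the bit order of the upper 32 bits. All helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def rotl64(x: int, n: int) -> int:
    """Circular left shift of a 64-bit value by n bit positions."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def decode32(sa: int) -> int:
    """Table 3 decoding: (32 - sa) ones followed by sa zeros."""
    return int("1" * (32 - sa) + "0" * sa, 2)


def reverse32(x: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    return int(format(x, "032b")[::-1], 2)


def srdl_below_32(operand: int, sa: int):
    """SRDL with SA < 32; returns (string S, swapped string, output)."""
    if not 0 <= sa < 32:
        raise ValueError("this sketch covers SA < 32 only")
    csa = 32 - sa                                         # complement shift amount
    pipe1 = ((operand << 32) | (operand >> 32)) & MASK64  # swap done in pipe 1
    shifted = rotl64(pipe1, csa)
    string_s = (decode32(sa) << 32) | MASK32
    swapped = (reverse32(decode32(sa)) << 32) | MASK32
    return string_s, swapped, shifted & swapped


s, swapped, out = srdl_below_32(0x0123456789ABCDEF, 20)
print(f"{s:016X} {swapped:016X} {out:016X}")
```

The swap in pipe stage 1 plus the circular left shift by CSA together amount to a circular right shift by SA; the swapped string then clears the SA leftmost bit positions.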

EXAMPLE 4)

Shift right double arithmetic SRDA (8 byte) with SA<32

RA addresses Ri+1 and RB addresses Ri. Thus the original data ABCD EFGH appear in the AREG as EFGH and BREG as ABCD. This swap is necessary as shifting right is done by a circular left shift with the complement of the shift amount. As shift 32 is done in pipe 1, CSA=32-SA. In arithmetical shifts the sign has to be extended. All bit positions of the string S carrying zero point to bit positions where the sign is extended to. The sign is forced at the AUXILIARY input of SU level 3.

    ____________________________________________________________
    SA = 20:   orig. Opnd. (neg.)   X'81 23 45 67  89 AB CD EF'
    CSA = 12:  result of pipe 1     X'89 AB CD EF  81 23 45 67'
               shifted Opnd.        X'BC DE F8 12  34 56 78 9A'
               String S             X'FF F0 00 00  FF FF FF FF'
               String S swapped     X'00 00 0F FF  FF FF FF FF'
               Shifter output       X'FF FF F8 12  34 56 78 9A'
    ____________________________________________________________
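The sign extension of Example 4 can be illustrated with the sketch below: wherever the swapped string carries a zero, the sign bit is forced into the result. The swapped string is built directly as a 64-bit mask, which is an assumption chosen to match the example; the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Circular left shift of a 64-bit value by n bit positions."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def srda_below_32(operand: int, sa: int) -> int:
    """SRDA with SA < 32 on a 64-bit two's complement operand."""
    if not 0 <= sa < 32:
        raise ValueError("this sketch covers SA < 32 only")
    pipe1 = ((operand << 32) | (operand >> 32)) & MASK64  # swap done in pipe 1
    shifted = rotl64(pipe1, 32 - sa)                      # circular left by CSA
    swapped = MASK64 >> sa                                # zeros in the SA leftmost bits
    result = shifted & swapped
    if operand >> 63:                                     # negative operand
        result |= ~swapped & MASK64                       # force sign where string is zero
    return result


print(f"{srda_below_32(0x8123456789ABCDEF, 20):016X}")
```

For a positive operand the result equals that of the logical shift, since the positions marked by zeros simply stay zero.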

EXAMPLE 5)

Shift right arithmetical SRA (4 byte) with SA=19

For all instructions executed by the shift unit 10 which operate on only 4 bytes, RA and RB address the same register. Thus the operand is duplicated to the AREG and BREG. Duplication is also necessary for parity prediction. The bits shifted out of the partially shifted byte are sensed only at shift unit byte 0 position, see FIGS. 4. Table 6 shows as an example the shift right arithmetical SRA (4 byte) operation with SA=19.

EXAMPLE 6) SRA with SA<32 of Example 5

Table 2 shows the selection of the parity bits which take part in the parity prediction. As in Example 5 above, SRA with SA<32 is explained. The byte parity is assumed to be odd.

Table 7 shows as an example a shift right arithmetical SRA (4 byte) operation with SA<32. Bit positions 31 and 23 of the CSA string select P0=1 and P1=0; P2, P3, and P4-P7 are forced to 1. The predicted double word parity PD is composed of P0-P7. Since the shift amount SA is odd, an odd number of sign bits are extended. Furthermore, the parity of the partially shifted byte has to be considered for the final predicted parity.

                                      TABLE 6
__________________________________________________________________________________________
Shift right arithmetical SRA (4 byte) with SA = 19.
__________________________________________________________________________________________
SA = 19   orig. Opnd         X'A5 B6 C7 D8'
CSA = 13  result of pipe 1   X'A5 B6 C7 D8 A5 B6 C7 D8'

Bit position:                111111 11112222 22222233 33333333 44444444 44555555 55556666
                  01234567 89012345 67890123 45678901 23456789 01234567 89012345 67890123
pipe 1          B'10100101 10110110 11000111 11011000 10100101 10110110 11000111 11011000'
shft 8 lvl 1    B'10110110 11000111 11011000 10100101 10110110 11000111 11011000 10100101'
                     **
shft 4 lvl 2    B'01101100 01111101 10001010 01011011 01101100 01111101 10001010 01011011'
                  *
shft 1 lvl 3    B'11011000 11111011 00010100 10110110 11011000 11111011 00010100 10110110'
SA              B'11111111 11111111 11100000 00000000 00000000 00000000 00000000 00000000'
CSA             B'00000000 00000000 00011111 11111111 11111111 11111111 11111111 11111111'
result          B'00000000 00000000 00010100 10110110 11011000 11111011 00010100 10110110'
sign ext.       B'11111111 11111111 111-----
final result    B'11111111 11111111 11110100 10110110 11011000 11111011 00010100 10110110'
__________________________________________________________________________________________
The asterisk (*) shows the bits of the partially shifted byte X'B6' which are used for parity prediction.
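The steps of Table 6 can be traced with a short sketch. The split of CSA into shift levels is an assumption: level 1 is taken to handle multiples of 8, level 2 the remaining even part, and level 3 a final single bit, which gives 8, 4 and 1 for CSA = 13 as in the table. The names `split_levels` and `sra_4byte` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Circular left shift of a 64-bit value by n bit positions."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def split_levels(csa: int):
    """Split a shift amount 0..31 into level 1, 2 and 3 partial shifts."""
    return csa & 0b11000, csa & 0b00110, csa & 0b00001


def sra_4byte(operand32: int, sa: int):
    """SRA on a 4-byte operand, 1 <= SA <= 31; returns (level rows, final)."""
    if not 1 <= sa <= 31:
        raise ValueError("this sketch covers 1 <= SA <= 31")
    value = (operand32 << 32) | operand32    # operand duplicated to A REG and B REG
    rows = []
    for amount in split_levels(32 - sa):     # CSA = 32 - SA
        value = rotl64(value, amount)
        rows.append(value)
    valid = MASK64 >> sa                     # "CSA" row: zeros in the SA leftmost bits
    result = value & valid                   # "result" row
    if operand32 >> 31:                      # negative operand: extend the sign
        result |= ~valid & MASK64
    return rows, result


rows, final = sra_4byte(0xA5B6C7D8, 19)
for row in rows:
    print(f"{row:016X}")
print(f"{final:016X}")
```

The upper 32 bits of the final value are the 4-byte result of the arithmetical right shift.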

                                      TABLE 7
__________________________________________________________________________________________
Shift right arithmetical SRA with SA < 32
__________________________________________________________________________________________
SA = 19   orig. Opnd                  X'A5 B6 C7 D8'
          Parity                         1  0  0  1
CSA = 13  result of pipe 1   A REG =  X'A5 B6 C7 D8'   B REG =  X'A5 B6 C7 D8'
          parity                         1  0  0  1                1  0  0  1

Bit position:                111111 11112222 22222233 33333333 44444444 44555555 55556666
                  01234567 89012345 67890123 45678901 23456789 01234567 89012345 67890123
pipe 1          B'10100101 10110110 11000111 11011000 10100101 10110110 11000111 11011000'
parity            P0 = 1   P1 = 0   P2 = 0   P3 = 1   P4 = 1   P5 = 0   P6 = 0   P7 = 1
SA              B'11111111 11111111 11100000 00000000 00000000 00000000 00000000 00000000'
CSA             B'00000000 00000000 00011111 11111111 11111111 11111111 11111111 11111111'
__________________________________________________________________________________________
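The parity values of Table 7 and the selection described in the text can be checked with the sketch below. The selection rule is a reading of Table 2 and of the paragraph above, not a confirmed description of the hardware: for SRA with SA<32, bit positions 31, 23, 15 and 7 of the CSA string are assumed to control P0 to P3, a 1 meaning the parity bit is taken and a 0 meaning it is forced to one, and P4 to P7 are always forced to one. The function names are hypothetical, and the final predicted parity PD is not computed here.

```python
def odd_parity_bit(byte: int) -> int:
    """Parity bit that makes the total number of ones (byte plus bit) odd."""
    return 1 - (bin(byte & 0xFF).count("1") % 2)


def select_parities_sra(parities, csa_string: str):
    """Select P0..P7 for SRA with SA < 32 (assumed reading of Table 2)."""
    positions = (31, 23, 15, 7)            # Table 2 row "SRA <32", columns P0..P3
    selected = []
    for p, pos in zip(parities[:4], positions):
        # Bit position 0 is the leftmost character of the string.
        selected.append(p if csa_string[pos] == "1" else 1)
    return selected + [1, 1, 1, 1]         # B REG parities P4..P7 forced to one


operand = [0xA5, 0xB6, 0xC7, 0xD8]
parities = [odd_parity_bit(b) for b in operand] * 2   # A REG and B REG copies
csa_string = "0" * 19 + "1" * 45                      # "CSA" row for SA = 19
print(parities)
print(select_parities_sra(parities, csa_string))
```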

We claim:
 1. A shift structure for executing shifting operations comprising: a data store (DLS) containing data (R0-R3) which can be read out individually or in a combination into a plurality of operand registers (A REG, B REG); and a shift unit (10) coupled to the output of the operand registers (A REG, B REG); characterised by a multiplexing unit (40) coupled to the output of the data store (DLS) for providing at least one shifting operation, when required, by placing the read out data (R0-R3) to be shifted in the plurality of operand registers (A REG, B REG); said shift structure further comprising two successive pipeline stages, wherein the data store (DLS) and the multiplexing unit (40) are in a first one of said pipeline stages, and the shift unit (10) is in a second one of said pipeline stages.
 2. The shift structure of claim 1, characterised in that the data store (DLS) containing data (R0-R3), each having a maximum k/2 bit length, which can be read individually or as pairs with maximum k bit length into a pair of the operand registers (A REG, B REG), each with k/2 bit length; and the shift unit (10) with k bit length is coupled to the pair of operand registers (A REG, B REG) and comprising a plurality of shifting elements; the multiplexing unit (40) providing a k/2 bit shifting, when required, by placing each one of the k/2 bit data (R0-R3) to be shifted in at least one of the operand registers (A REG, B REG); and the shift unit (10) only requires shifting elements with a maximum shift amount of k/4.
 3. The shift structure according to claim 1, characterized in that the multiplexing unit (40) comprises a duplication unit for duplicating the contents of the operand registers (A REG, B REG).
 4. A shift unit (10) with k bit length which can be used in the shift structure according to claim 1, whereby the shift unit (10) executes shifting operations in only one direction in a circular manner, characterized in that: the shift unit (10) comprises a plurality of consecutive shift levels each comprising a number n, p, q, etc. of shift gates, each with different shift amounts, whereby n, p, q, etc. and the shifting amounts represent integer values, whereby in a first shift level, the shift gates comprise the shift amounts 0, k/2(n), 2 * k/2(n), 3 * k/2(n), 4 * k/2(n), . . . , (n-2) * k/2(n), (n-1) * k/2(n), each with a distance of k/2(n) between two shifting amounts next to each other; and in each consecutive shift level the distance between two shifting amounts, next to each other, is divided into k/2(π), whereby π represents the product of the number of shift gates in the preceding shift levels up to the present shift level.
 5. The shift unit (10) of claim 4, characterised in that the last shift level ends with a shifting amount of 1.
 6. The shift unit (10) according to claim 4, characterised by a string comprised of individual bit or byte values for correcting the circular shift result into a linear shift result, whereby the shift amount to be shifted is decoded to the string defining the valid bits of the shift result.
 7. The shift unit (10) according to claim 6, further comprising a parity prediction unit (30).
 8. The shift unit (10) according to claim 7, characterised in that in the prediction unit (30) the counting of the bits which are shifted out of a partially shifted byte is accomplished only on one side.
 9. The shift unit (10) according to claim 8, characterised in that in the prediction unit (30) the counting of the bits which are shifted out of a partially shifted byte is accomplished only on the side to which the bits are shifted to in a circular manner.
 10. The shift unit (10) according to claim 7, characterised in that the string allows to control a parity prediction by selecting parity bits for the parity prediction in the parity prediction unit (30).
 11. An execution unit for executing data manipulations in a processor unit characterized by a shift unit (10) according to claim 1.
 12. A processor chip characterized by an execution unit according to claim 11.
 13. A method for executing a shifting operation comprising the steps of: reading out an operand to be shifted from a data store (DLS); putting the operand, by a multiplexing unit (40), into a plurality of operand registers (R0-R3), thus executing at least one partial shift, when required; shifting the operand by a shift unit (10) by executing the remaining partial shifts; wherein the steps of reading out the operand to be shifted and putting the operand into a plurality of operand registers are executed in a first pipe stage and during a first cycle; and the step of shifting the operand by a shift unit (10) by executing the remaining partial shifts is executed in a second pipe stage during a next cycle.
 14. The method of claim 13, wherein: the operand to be shifted is put into one of a pair of the operand registers, each k/2 bits long with k representing an integer value, thus executing a k/2 bit shift when required; the shifting of the operand is executed by consecutive shifting of the remaining partial shifts in a k bits long shift unit.
 15. The method of claim 14, wherein for shifting with a shift amount greater than or equal to k/2 to the left or right, only the rightmost or leftmost k/2 data word respectively, is read out into the most left or right one of the pair of operand registers respectively, representing a k/2 shift during the cycle in the first pipe stage.
 16. Use of the method according to claim 13 in an execution unit for executing data manipulations in a processor unit.
 17. Use of the method according to claim 13 in a processor chip.
 18. A shift structure for executing shifting operations comprising: a data store (DLS) containing data (R0-R3) which can be read out individually or in a combination into a plurality of operand registers (A REG, B REG); and a shift unit (10) coupled to the output of the operand registers (A REG, B REG), said shift unit capable of shifting the output of said operand registers by a predetermined shift amount (SA), said shift unit further deriving from said predetermined shift amount (SA) a control string, said control string being used to correct the output of said shift unit.
 19. The shift structure of claim 18, wherein the control string is comprised of individual bit values for correcting a circular shift result into a linear shift result.
 20. The shift structure of claim 18, wherein the control string is comprised of individual bit values for controlling a parity prediction unit by selecting parity bits.